Comparison between Expert Listeners and Continuous Speech Recognizers in Selecting Pronunciation Variants
Abstract
In this paper, the performance of an automatic transcription tool is evaluated. The transcription tool is a continuous speech recognizer (CSR) which can be used to select pronunciation variants (i.e. detect insertions and deletions of phones). The performance of the CSR was compared to a reference transcription based on the judgments of expert listeners. We investigated to what extent the degree of agreement between the listeners and the CSR was affected by employing various sets of phone models (PMs). Overall, the PMs perform more similarly to the listeners when pronunciation variation is modeled. However, the various sets of PMs lead to different results for insertion and deletion processes. Furthermore, we found that to a certain degree, word error rates can be used to predict which set of PMs to use in the transcription tool.

... corpus is by modeling pronunciation variation [2]. Another way of obtaining models which are less contaminated is to train PMs on read speech. It is well known that the extent of variation in spontaneous speech is larger than in read speech. So, for read speech there will be fewer mismatches between the speech signal and the transcriptions. Thus, it is to be expected that PMs which are trained on read speech will be less contaminated than those trained on spontaneous speech. One can imagine that PMs with varying degrees of contamination may cause the CSR to select different pronunciation variants. As a consequence, the degree of agreement between the CSR and the reference transcription may vary as a function of the PMs employed. The purpose of the present study is to investigate to what extent the degree of ...
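The degree of agreement between a CSR and a reference transcription can be quantified in several ways. The text above does not say which measure was used, so the following Python sketch is only an illustration under stated assumptions: each decision is coded as 1 (phone present) or 0 (phone absent), and agreement is summarized by raw percentage agreement and by Cohen's kappa, which corrects for chance agreement. The function names and the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter


def percent_agreement(a, b):
    """Fraction of cases in which two raters made the same decision."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: agreement corrected for chance (assumed measure)."""
    n = len(a)
    p_o = percent_agreement(a, b)           # observed agreement
    count_a, count_b = Counter(a), Counter(b)
    labels = set(a) | set(b)
    # Expected agreement if both raters decided independently
    # with their own marginal label frequencies.
    p_e = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    if p_e == 1.0:                          # both raters used one identical label
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Toy data (invented for illustration): 1 = phone present, 0 = phone absent.
reference = [1, 1, 0, 1, 0, 0, 1, 1]   # hypothetical listener-based reference
csr       = [1, 0, 0, 1, 0, 1, 1, 1]   # hypothetical CSR decisions

print(f"agreement = {percent_agreement(reference, csr):.2f}")
print(f"kappa     = {cohen_kappa(reference, csr):.2f}")
```

Running the same computation once per set of PMs would give one agreement figure per set, which is the kind of comparison the abstract describes.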
Similar resources
The selection of pronunciation variants: comparing the performance of man and machine
In this paper, the performance of an automatic transcription tool is evaluated. The transcription tool is a Continuous Speech Recognizer (CSR) running in forced recognition mode. For evaluation, the performance of the CSR was compared to that of nine expert listeners. Both man and machine carried out exactly the same task: deciding whether a segment was present or not in 467 cases. It turned ...
Automatic Generation of Pronunciation Dictionaries
In this report we describe a data-driven approach for creating pronunciation dictionaries for a new, unseen target language by voting among phoneme recognizers in nine different languages other than the target language. In this process, recordings of the new language that are transcribed at the word level are decoded by the phoneme recognizers. This results in a hypothesis of nine phonemes per t...
The roles of reconstruction and lexical storage in the comprehension of regular pronunciation variants
This paper investigates how listeners process regular pronunciation variants, resulting from simple general reduction processes. Study 1 shows that when listeners are presented with new words, they store the pronunciation variants presented to them, whether these are unreduced or reduced. Listeners thus store information on word-specific pronunciation variation. Study 2 suggests that if partici...
Accent and television journalism: evidence for the practice of speech language pathologists and audiologists.
PURPOSE: To analyze the preferences and attitudes of listeners in relation to regional (RA) and softened accents (SA) in television journalism. METHODS: Three television news presenters recorded carrier phrases and a standard text using RA and SA. The recordings were presented to 105 judges who listened to the word pairs and answered whether they perceived differences between the RA and SA, and...
Automatic text-independent pronunciation scoring of foreign language student speech
SRI International is currently involved in the development of a new generation of software systems for automatic scoring of pronunciation as part of the Voice Interactive Language Training System (VILTS) project. This paper describes the goals of the VILTS system, the speech corpus, and the algorithm development. The automatic grading system uses SRI’s Decipher™ continuous speech recognition s...
Publication date: 1999